43 research outputs found
Finding Objects of Interest in Images using Saliency and Superpixels
The ability to automatically find objects of interest in images is useful in the areas of compression, indexing and retrieval, re-targeting, and so on. There are two classes of such algorithms – those that find any object of interest with no prior knowledge, independent of the task, and those that find specific objects of interest known a priori. The former class of algorithms tries to detect objects in images that stand out, i.e., are salient, by virtue of being different from the rest of the image and consequently capture our attention. The detection is generic in this case as there is no specific object we are trying to locate. The latter class of algorithms detects specific known objects of interest and often requires training using features extracted from known examples. In this thesis we address various aspects of finding objects of interest under the topics of saliency detection and object detection. We present two saliency detection algorithms that rely on the principle of center-surround contrast. These two algorithms are shown to be superior to several state-of-the-art techniques in terms of precision and recall measures with respect to a ground truth. They output full-resolution saliency maps, are simpler to implement, and are computationally more efficient than most existing algorithms. We further establish the relevance of our saliency detection algorithms by using them for the known applications of object segmentation and image re-targeting. We first present three different techniques for salient object segmentation using our saliency maps that are based on clustering, graph-cuts, and geodesic-distance-based labeling. We then demonstrate the use of our saliency maps for a popular technique of content-aware image resizing and compare the result with that of existing methods. Our saliency maps prove to be a much more effective replacement for conventional gradient maps for providing automatic content-awareness.
Just as it is important to find regions of interest in images, it is also important to find interesting images within a large collection of images. We therefore extend the notion of saliency detection in images to image databases. We propose an algorithm for finding salient images in a database. Apart from finding such images, we also present two novel techniques for creating visually appealing summaries in the form of collages and mosaics. Finally, we address the problem of finding specific known objects of interest in images. Specifically, we deal with the feature extraction step that is a prerequisite for any technique in this domain. In this context, we first present a superpixel segmentation algorithm that outperforms previous algorithms in terms of the quantitative measures of under-segmentation error and boundary recall. Our superpixel segmentation algorithm also offers several other advantages over existing algorithms, such as compactness, uniform size, control over the number of superpixels, and computational efficiency. We prove the effectiveness of our superpixels by deploying them in existing algorithms, specifically an object class detection technique and a graph-based algorithm, and improving their performance. We also present the result of using our superpixels in a technique for detecting mitochondria in noisy medical images.
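As a rough illustration of contrast-based saliency, the sketch below scores each pixel by how far its value lies from the mean of the whole image. Treating the entire image as the "surround" and working on grayscale values are simplifying assumptions made here for brevity; the function name `global_contrast_saliency` is hypothetical, and this is not presented as either of the thesis's two algorithms.

```python
def global_contrast_saliency(image):
    """Return a full-resolution saliency map for `image` (a list of rows of numbers).

    Assumption: the "surround" is the whole image, summarized by its mean value.
    """
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    # Contrast of every pixel (the "center") against the image mean (the "surround").
    raw = [[abs(v - mean) for v in row] for row in image]
    peak = max(max(row) for row in raw)
    if peak == 0:
        # A perfectly uniform image has no salient pixels.
        return [[0.0 for _ in row] for row in image]
    # Normalize to [0, 1] so maps from different images are comparable.
    return [[v / peak for v in row] for row in raw]


# A dark background with a small bright object in the middle row.
image = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
saliency = global_contrast_saliency(image)
```

In this toy input the two bright pixels receive the highest score and the background a low one, and the map has the same resolution as the input.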
Deep Residual Network for Joint Demosaicing and Super-Resolution
In digital photography, two image restoration tasks have been studied
extensively and resolved independently: demosaicing and super-resolution. Both
these tasks are related to resolution limitations of the camera. Performing
super-resolution on a demosaiced image simply exacerbates the artifacts
introduced by demosaicing. In this paper, we show that such accumulation of
errors can be easily averted by jointly performing demosaicing and
super-resolution. To this end, we propose a deep residual network for learning
an end-to-end mapping between Bayer images and high-resolution images. By
training on high-quality samples, our deep residual demosaicing and
super-resolution network is able to recover high-quality super-resolved images
from low-resolution Bayer mosaics in a single step without producing the
artifacts common to such processing when the two operations are done
separately. We perform extensive experiments to show that our deep residual
network achieves demosaiced and super-resolved images that are superior to the
state-of-the-art both qualitatively and in terms of PSNR and SSIM metrics.
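To make the network's input concrete, the sketch below samples a full RGB image down to a Bayer mosaic, which keeps only one color sample per pixel. The RGGB layout and the function name `bayer_mosaic` are assumptions for illustration; the abstract does not specify the pattern, and the code shows only what the mosaic discards, not the proposed network.

```python
def bayer_mosaic(rgb):
    """Sample an RGB image (rows of (r, g, b) tuples) with an assumed RGGB pattern."""
    mosaic = []
    for y, row in enumerate(rgb):
        out_row = []
        for x, (r, g, b) in enumerate(row):
            if y % 2 == 0 and x % 2 == 0:
                out_row.append(r)  # red site: even row, even column
            elif y % 2 == 1 and x % 2 == 1:
                out_row.append(b)  # blue site: odd row, odd column
            else:
                out_row.append(g)  # green sites: the remaining two per 2x2 cell
        mosaic.append(out_row)
    return mosaic


rgb = [
    [(1, 2, 3), (4, 5, 6)],
    [(7, 8, 9), (10, 11, 12)],
]
mosaic = bayer_mosaic(rgb)  # one value per pixel instead of three
```

Two thirds of the color samples are missing from the mosaic, so a joint method has to recover the missing colors and the missing resolution from the same sparse input.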
Deep Feature Factorization For Concept Discovery
We propose Deep Feature Factorization (DFF), a method capable of localizing
similar semantic concepts within an image or a set of images. We use DFF to
gain insight into a deep convolutional neural network's learned features, where
we detect hierarchical cluster structures in feature space. This is visualized
as heat maps, which highlight semantically matching regions across a set of
images, revealing what the network 'perceives' as similar. DFF can also be used
to perform co-segmentation and co-localization, and we report state-of-the-art
results on these tasks. Comment: The European Conference on Computer Vision (ECCV), 201
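The abstract does not say which factorization DFF applies. The sketch below assumes non-negative matrix factorization (NMF) of an activation matrix whose rows are spatial positions and whose columns are feature channels, so that each column of the factor `h` can be read as a heat map over positions. The multiplicative update rule, the toy data, and all function names are illustrative choices, not taken from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(a, k, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix `a` (n x c) as h (n x k) times w (k x c)."""
    rng = random.Random(seed)
    n, c = len(a), len(a[0])
    h = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    w = [[rng.random() + 0.1 for _ in range(c)] for _ in range(k)]
    for _ in range(iterations):
        # Multiplicative update: h <- h * (a w^T) / (h w w^T)
        num = matmul(a, transpose(w))
        den = matmul(matmul(h, w), transpose(w))
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
        # Multiplicative update: w <- w * (h^T a) / (h^T h w)
        num = matmul(transpose(h), a)
        den = matmul(matmul(transpose(h), h), w)
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(c)] for i in range(k)]
    return h, w


def reconstruction_error(a, h, w):
    """Squared Frobenius distance between `a` and the product h w."""
    approx = matmul(h, w)
    return sum((x - y) ** 2 for ra, rb in zip(a, approx) for x, y in zip(ra, rb))


# Toy "activations": positions 0-1 respond on channels 0-1, positions 2-3 on channels 2-3.
activations = [
    [1.0, 1.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 1.0, 1.0],
]
h, w = nmf(activations, k=2)
```

For real feature maps, the positions of all images in a set would be stacked into the rows of one matrix, which is one way a single factor could highlight matching regions across images.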
Superpixels and Polygons using Simple Non-Iterative Clustering
We present an improved version of the Simple Linear Iterative Clustering (SLIC) superpixel segmentation. Unlike SLIC, our algorithm is non-iterative, enforces connectivity from the start, requires less memory, and is faster. Relying on the superpixel boundaries obtained using our algorithm, we also present a polygonal partitioning algorithm. We demonstrate that our superpixels as well as the polygonal partitioning are superior to the respective state-of-the-art algorithms on quantitative benchmarks.
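One way to cluster in a single pass while keeping every superpixel connected is to grow regions outward from seed pixels using a priority queue. The sketch below does this on a grayscale image; the distance formula, the weights, the 4-neighborhood, and the name `grow_superpixels` are assumptions made for illustration and should not be read as the published algorithm.

```python
import heapq


def grow_superpixels(image, seeds, spatial_weight=1.0, color_weight=1.0):
    """Label each pixel of `image` (rows of numbers) with the index of a seed.

    Regions grow from the seeds through 4-connected neighbors, so each label
    forms a connected region by construction, and each pixel is labeled once.
    """
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    # Running sums per cluster: [sum_y, sum_x, sum_intensity, pixel_count].
    sums = [[0.0, 0.0, 0.0, 0] for _ in seeds]
    heap = [(0.0, y, x, k) for k, (y, x) in enumerate(seeds)]
    heapq.heapify(heap)
    while heap:
        _, y, x, k = heapq.heappop(heap)
        if labels[y][x] != -1:
            continue  # already claimed by a closer cluster
        labels[y][x] = k
        s = sums[k]
        s[0] += y
        s[1] += x
        s[2] += image[y][x]
        s[3] += 1
        # Centroid of cluster k, updated online as the region grows.
        cy, cx, cv = s[0] / s[3], s[1] / s[3], s[2] / s[3]
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and labels[ny][nx] == -1:
                d = (spatial_weight * ((ny - cy) ** 2 + (nx - cx) ** 2)
                     + color_weight * (image[ny][nx] - cv) ** 2)
                heapq.heappush(heap, (d, ny, nx, k))
    return labels


# Two flat regions side by side, one seed in each.
image = [[0, 0, 0, 100, 100, 100] for _ in range(4)]
labels = grow_superpixels(image, seeds=[(1, 1), (1, 4)])
```

Because a pixel can only be reached through an already labeled neighbor, no separate connectivity-repair step is needed, and there is no loop over repeated assignment and update passes.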
Self-Binarizing Networks
We present a method to train self-binarizing neural networks, that is,
networks that evolve their weights and activations during training to become
binary. To obtain similar binary networks, existing methods rely on the sign
activation function. This function, however, has zero gradient for all non-zero
inputs, which makes standard backpropagation impossible. To circumvent the
difficulty of training a network relying on the sign activation function, these
methods alternate between floating-point and binary representations of the
network during training, which is sub-optimal and inefficient. We approach the
binarization task by training on a unique representation involving a smooth
activation function, which is iteratively sharpened during training until it
becomes a binary representation equivalent to the sign activation function.
Additionally, we introduce a new technique to perform binary batch
normalization that simplifies the conventional batch normalization by
transforming it into a simple comparison operation. This is unlike existing
methods, which are forced to retain the conventional floating-point-based
batch normalization. Our binary networks, apart from displaying advantages of
lower memory and computation as compared to conventional floating-point and
binary networks, also show higher classification accuracy than existing
state-of-the-art methods on multiple benchmark datasets. Comment: 9 pages, 5 figures
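Two ideas in this abstract can be checked numerically. The sketch below assumes the smooth activation is a scaled hyperbolic tangent, `tanh(nu * x)`, which approaches the sign function as `nu` grows; the abstract only says "smooth activation function", so this choice is an assumption. It also shows why batch normalization followed by a sign activation can collapse into a single threshold comparison. The parameter names (`mu`, `sigma`, `gamma`, `beta`) follow common batch-normalization notation and are not taken from the paper, as is the convention that the sign of zero is +1.

```python
import math


def sharpened_tanh(x, nu):
    """Smooth activation that approaches sign(x) as the scale `nu` grows."""
    return math.tanh(nu * x)


def bn_then_sign(x, mu, sigma, gamma, beta):
    """Conventional path: floating-point batch normalization, then sign."""
    y = gamma * (x - mu) / sigma + beta
    return 1 if y >= 0 else -1


def binary_bn(x, mu, sigma, gamma, beta):
    """Same output computed as one comparison against a precomputed threshold.

    gamma * (x - mu) / sigma + beta >= 0 rearranges to
    x >= mu - beta * sigma / gamma when gamma > 0, and to
    x <= mu - beta * sigma / gamma when gamma < 0 (sigma > 0 assumed).
    """
    threshold = mu - beta * sigma / gamma
    if gamma > 0:
        return 1 if x >= threshold else -1
    return 1 if x <= threshold else -1


soft = sharpened_tanh(0.1, nu=1)     # still far from +1
hard = sharpened_tanh(0.1, nu=100)   # nearly indistinguishable from sign(0.1)
```

At inference time the threshold can be computed once per channel, so the normalization costs a comparison rather than a multiply, divide, and add.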
Face Recognition in Real-world Images
Face recognition systems are designed to handle well-aligned images captured under controlled situations. However, real-world images present varying orientations, expressions, and illumination conditions. Traditional face recognition algorithms perform poorly on such images. In this paper we present a method for face recognition adapted to real-world conditions that can be trained using very few training examples and is computationally efficient. Our method consists of performing a novel alignment process followed by classification using sparse representation techniques. We present our recognition rates on a difficult dataset that represents real-world faces, where we significantly outperform state-of-the-art methods.
Deep Residual Network for Joint Demosaicing and Super-Resolution
The two classic image restoration tasks, demosaicing and super-resolution, have traditionally always been studied independently. That is sub-optimal as sequential processing, demosaicing and then super-resolution, may lead to amplification of artifacts. In this paper, we show that such accumulation of errors can be easily averted by jointly performing demosaicing and super-resolution. To this end, we propose a deep residual network for learning an end-to-end mapping between Bayer images and high-resolution images. Our deep residual demosaicing and super-resolution network is able to recover high-quality super-resolved images from low-resolution Bayer mosaics in a single step without producing the artifacts common to such processing when the two operations are done separately. We perform extensive experiments to show that our deep residual network achieves demosaiced and super-resolved images that are superior to the state-of-the-art both qualitatively and quantitatively.
Extreme Image Completion
It is challenging to complete an image in which 99 percent of the pixels are randomly missing. We present a solution to this extreme image completion problem. As opposed to existing techniques, our solution has a computational complexity that is linear in the number of pixels of the full image and is real-time in practice. For comparable quality of reconstruction, our algorithm is thus almost 2 to 5 orders of magnitude faster than existing techniques.
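The abstract does not describe the method itself. As a hedged illustration of how completion can stay linear in the number of pixels, the sketch below fills each missing pixel with a Gaussian-weighted average of the known pixels in a fixed-size window, so the cost is the pixel count times a constant window size. The window radius, the Gaussian weights, and the name `complete_image` are assumptions, not the authors' algorithm.

```python
import math


def complete_image(sparse, mask, radius=2, sigma=1.0):
    """Fill missing pixels of `sparse` (rows of numbers) from nearby known pixels.

    `mask[y][x]` is True where the pixel value is known. Each missing pixel gets
    the weighted sum of known neighbors divided by the sum of their weights.
    """
    height, width = len(sparse), len(sparse[0])
    # Precompute the window once: (dy, dx, weight) for every offset.
    offsets = [
        (dy, dx, math.exp(-(dy * dy + dx * dx) / (2 * sigma * sigma)))
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
    ]
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                out[y][x] = float(sparse[y][x])  # keep known samples unchanged
                continue
            num = den = 0.0
            for dy, dx, weight in offsets:
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width and mask[yy][xx]:
                    num += weight * sparse[yy][xx]
                    den += weight
            # Pixels with no known neighbor in the window stay at 0.0.
            out[y][x] = num / den if den > 0 else 0.0
    return out


# A constant image of value 7 with only three known samples on the diagonal.
known = {(0, 0), (2, 2), (4, 4)}
mask = [[(y, x) in known for x in range(5)] for y in range(5)]
sparse = [[7 if mask[y][x] else 0 for x in range(5)] for y in range(5)]
completed = complete_image(sparse, mask)
```

With 99 percent of the pixels missing, a fixed small window would leave many pixels without any known neighbor, so a practical method would need a window that adapts to the local sample density.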